The overall admission rate is \(35.8\%\).
Decision:
Originally the plan was to show the applicants’ measures by state.
But due to the size of dataset, the result is quite sparse. So hear we
decide to devide the states into two partitions: East and
West.
Decision:
Decision:
Decision:
The dataset contains 81 valid admission results from the CSV file
SummerStudentAdmissions2.csv.
Three versions of this dataset are included on the right hand side:
Due to the lack of information, some of the variables and contents from the dataset are interpreted intuitively.
In the cleaned dataset,
gender=-1 means the gender is undisclosed.volunteer_level is ranked from 5 to 0.gpa is calculated on a 4.0 scale.writing_score should be on a 100 scale.test_score has rather limited information.work_exp’s unit is year.The dashboard is powered by
flexdashboardDTplotlyhrbrmstr.The codes are open-sourced. Please feel free to star or fork this repository.
---
title: "Admission Dashboard"
output:
flexdashboard::flex_dashboard:
storyboard: true
orientation: columns
vertical_layout: fill
social: ["twitter", "linkedin"]
source_code: embed
theme: bootstrap
logo: static/logo.png
favicon: static/favicon.png
css: style.css
---
```{r setup, include=FALSE}
gc()
rm(list = ls())
if (!require("pacman")) install.packages("pacman")
pacman::p_load(
tidyverse, flexdashboard,
here, styler, patchwork,
hrbrthemes, ggthemes, ggtext, plotly,
glue, waffle, DT, geofacet, ggbeeswarm
)
dat <- read_csv(here("data/data-cleaned.csv"))
style_file("index.Rmd")
# standardize all numeric variables
dat_stand <- dat |>
mutate(
decision = as_factor(decision),
state = as_factor(state),
gender = as_factor(gender),
across(where(is.numeric), ~ round(scale(.)[, 1], 2)),
partition = if_else(
state %in% c("California", "Colorado", "Utah", "Oregon"), "west", "east"
)
)
dat_stand_long <- dat_stand |> pivot_longer(
cols = c(
gpa, work_exp, test_score,
writing_score, volunteer_level
),
names_to = "variable",
values_to = "value"
)
dat_long <- dat |>
mutate(partition = if_else(
state %in% c("California", "Colorado", "Utah", "Oregon"), "west", "east"
)) |>
pivot_longer(
cols = c(
gpa, work_exp, test_score,
writing_score, volunteer_level
),
names_to = "variable",
values_to = "value"
)
```
Insights {.storyboard data-icon="fa-chart-line" data-commentary-width=200}
=====================================
### **Overall admission results.**
```{r, fig.width=8, fig.height=8}
dat |>
count(decision) -> admission_summary
p1 <- ggplot(admission_summary, aes(fill = decision, values = n)) +
geom_waffle(color = "white", size = 1.125, n_rows = 9, flip = TRUE) +
scale_fill_manual(
values = c("#1A6899", "#FC5449", "#FFCF58"),
labels = c("Admit", "Decline", "Waitlist")
) +
coord_equal() +
theme_ipsum_rc() +
theme(
legend.position = "bottom",
legend.title = element_text(color = "#000000"),
legend.text = element_text(color = "#000000"),
axis.title.y = element_text(
family = "IBM Plex Sans", face = "bold"
),
axis.title.x = element_text(
family = "IBM Plex Sans", face = "bold"
),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
text = element_text(
family = "IBM Plex Sans",
color = "#3B372E"
),
plot.title = element_text(hjust = 0.5),
plot.subtitle = element_markdown(hjust = 0.5),
plot.caption = element_markdown(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
title = "SUMMER 2022 ADMISSION RESULTS",
subtitle = "BAD DATA EXCLUDED.",
caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
x = "",
y = ""
)
p1
```
***
The overall admission rate is $35.8\%$.
### **Admission results by gender.**
```{r, fig.width=12,fig.height=8}
# - male 44, admit 15 + decline 16 + waitlist 13
# - female 34, admit 12 + decline 12 + waitlist 10
dat |>
select(decision, gender) |>
mutate(gender = as_factor(gender)) |>
ggplot(aes(gender)) +
geom_bar(aes(fill = decision),
position = position_stack(reverse = TRUE),
width = 0.2
) +
scale_fill_manual(
values = c("#1A6899", "#FC5449", "#FFCF58"),
labels = c("Admit", "Decline", "Waitlist")
) +
scale_x_discrete(labels = c("undisclosed", "female", "male")) +
theme_ipsum_rc() +
theme(
legend.position = "bottom",
legend.title = element_text(color = "#000000"),
legend.text = element_text(color = "#000000"),
axis.title.y = element_text(
family = "IBM Plex Sans", face = "bold"
),
axis.title.x = element_text(
family = "IBM Plex Sans", face = "bold"
),
# axis.text.x = element_text(),
# axis.text.y = element_text(),
text = element_text(
family = "IBM Plex Sans",
color = "#3B372E"
),
plot.title = element_text(hjust = 0.5),
plot.subtitle = element_markdown(hjust = 0.5),
plot.caption = element_markdown(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
title = "SUMMER 2022 ADMISSION RESULTS",
subtitle = "BY GENDER",
caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
x = "gender",
y = "count"
)
```
***
- male 44, admit 15 + decline 16 + waitlist 13
- female 34, admit 12 + decline 12 + waitlist 10
**Decision:**
### **Applicants by geolocations.**
```{r, fig.width=12,fig.height=8}
# dat_long |>
# summary()
p3 <- dat_long |>
ggplot(aes(x = value, color = partition, fill = partition)) +
geom_density(alpha = 0.3) +
facet_wrap(~variable, scales = "free") +
scale_fill_manual(
values = c("#FC5449", "#1A6899"),
labels = c("east", "west")
) +
scale_color_manual(
values = c("#FC5449", "#1A6899"),
labels = c("east", "west")
) +
theme_ipsum_rc() +
theme(
legend.position = "none",
legend.title = element_text(color = "#000000"),
legend.text = element_text(color = "#000000"),
axis.title.y = element_text(
family = "IBM Plex Sans", face = "bold"
),
axis.title.x = element_text(
family = "IBM Plex Sans", face = "bold"
),
# axis.text.x = element_blank(),
# axis.text.y = element_blank(),
text = element_text(
family = "IBM Plex Sans",
color = "#3B372E"
),
plot.title = element_text(hjust = 0.5),
plot.subtitle = element_markdown(hjust = 0.5),
plot.caption = element_markdown(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
title = "SUMMER 2022 ADMISSION RESULTS",
subtitle = "**WEST** VS **EAST**",
caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
x = "candidates' standardized measurement",
y = "density"
)
# ggplotly(p3)
p3
```
***
Originally the plan was to show the applicants' measures by state. But due to the size of dataset, the result is quite sparse. So hear we decide to devide the states into two partitions: `East` and `West`.
**Decision:**
### **Grade matters?** Does the GPA or standardized testing determine a person's success?
```{r, fig.width=16,fig.height=8}
p4 <- dat |>
ggplot(aes(x = decision, y = gpa, color = decision)) +
geom_beeswarm(size = 2) +
scale_color_manual(
values = c("#1A6899", "#FC5449", "#FFCF58"),
labels = c("Admit", "Decline", "Waitlist")
) +
theme_ipsum_rc() +
theme(
legend.position = "bottom",
legend.title = element_text(color = "#000000"),
legend.text = element_text(color = "#000000"),
axis.title.y = element_text(
family = "IBM Plex Sans", face = "bold"
),
axis.title.x = element_text(
family = "IBM Plex Sans", face = "bold"
),
# axis.text.x = element_text(),
# axis.text.y = element_text(),
text = element_text(
family = "IBM Plex Sans",
color = "#3B372E"
),
plot.title = element_text(hjust = 0.5),
plot.subtitle = element_markdown(hjust = 0.5),
plot.caption = element_markdown(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
title = "BEESWARM PLOT OF ADMISSION RESULTS",
subtitle = "BY DECISION",
caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
x = "decision",
y = "gpa (4.0)"
)
p5 <- dat |>
ggplot(aes(x = decision, y = test_score, color = decision)) +
geom_beeswarm(size = 2) +
scale_color_manual(
values = c("#1A6899", "#FC5449", "#FFCF58"),
labels = c("Admit", "Decline", "Waitlist")
) +
theme_ipsum_rc() +
theme(
legend.position = "bottom",
legend.title = element_text(color = "#000000"),
legend.text = element_text(color = "#000000"),
axis.title.y = element_text(
family = "IBM Plex Sans", face = "bold"
),
axis.title.x = element_text(
family = "IBM Plex Sans", face = "bold"
),
# axis.text.x = element_text(),
# axis.text.y = element_text(),
text = element_text(
family = "IBM Plex Sans",
color = "#3B372E"
),
plot.title = element_text(hjust = 0.5),
plot.subtitle = element_markdown(hjust = 0.5),
plot.caption = element_markdown(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
title = "BEESWARM PLOT OF STANDARDIZED TEST SCORE",
subtitle = "BY DECISION",
caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
x = "decision",
y = "score"
)
p4 + p5
```
***
**Decision:**
### **Volunteer level.** Or it outweighted by work experience?
```{r, fig.width=12,fig.height=8}
p6 <- dat |>
ggplot(aes(x = work_exp, y = volunteer_level, color = decision)) +
geom_beeswarm(size = 2) +
scale_color_manual(
values = c("#1A6899", "#FC5449", "#FFCF58"),
labels = c("Admit", "Decline", "Waitlist")
) +
theme_ipsum_rc() +
theme(
legend.position = "bottom",
legend.title = element_text(color = "#000000"),
legend.text = element_text(color = "#000000"),
axis.title.y = element_text(
family = "IBM Plex Sans", face = "bold"
),
axis.title.x = element_text(
family = "IBM Plex Sans", face = "bold"
),
# axis.text.x = element_text(),
# axis.text.y = element_text(),
text = element_text(
family = "IBM Plex Sans",
color = "#3B372E"
),
plot.title = element_text(hjust = 0.5),
plot.subtitle = element_markdown(hjust = 0.5),
plot.caption = element_markdown(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
) +
labs(
title = "WORK EXPERIENCE VS VOLUNTEER LEVEL",
caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
x = "work experience (years)",
y = "volunteer level (0-5)"
)
p6
```
***
**Decision:**
About {data-icon="fa-info" data-orientation=columns}
=====================================
Column {data-width=350}
-----------------------------------------------------------------------
The dataset contains 81 valid admission results from the CSV file `SummerStudentAdmissions2.csv`.
Three versions of this dataset are included on the right hand side:
- Standardized data.
- Cleaned data.
- Raw data.
Due to the lack of information, some of the variables and contents from the dataset are interpreted intuitively.
In the cleaned dataset,
- `gender=-1` means the gender is undisclosed.
- `volunteer_level` is ranked from 5 to 0.
- `gpa` is calculated on a 4.0 scale.
- `writing_score` should be on a 100 scale.
- `test_score` has rather limited information.
- `work_exp`'s unit is year.
***
The dashboard is powered by
- [`flexdashboard`](https://pkgs.rstudio.com/flexdashboard/)
- [`DT`](https://rstudio.github.io/DT/)
- [`plotly`](https://plotly.com/)
- The static visualization theme is customized based on [`hrbrmstr`](https://github.com/hrbrmstr/hrbrthemes).
***
The codes are open-sourced. **Please feel free to star or fork this repository.**
Column {.tabset data-width=650 data-height=1000}
-----------------------------------------------------------------------
### Standardized data
```{r}
DT::datatable(dat_stand,
options = list(
bPaginate = FALSE
),
style = "bootstrap"
) |>
formatStyle(
"decision",
backgroundColor = styleEqual(
c("Admit", "Decline", "Waitlist"),
c("#1A6899", "#FC5449", "#FFCF58")
)
) |>
formatStyle(c(
"gpa", "work_exp", "test_score", "writing_score",
"volunteer_level"
),
background = styleColorBar(range(c(
dat_stand$gpa, dat_stand$work_exp, dat_stand$test_score, dat_stand$writing_score,
dat_stand$volunteer_level
)), "lightblue"),
backgroundSize = "98% 88%",
backgroundRepeat = "no-repeat",
backgroundPosition = "center"
)
```
### Cleaned data
```{r}
dat <- read_csv("data/data-cleaned.csv")
DT::datatable(dat,
options = list(
bPaginate = FALSE
),
style = "bootstrap"
) |>
formatStyle(
"decision",
backgroundColor = styleEqual(
c("Admit", "Decline", "Waitlist"),
c("#1A6899", "#FC5449", "#FFCF58")
)
)
```
### Raw data
```{r}
DT::datatable(read_csv("data/SummerStudentAdmissions2.csv"),
options = list(
bPaginate = FALSE
),
style = "bootstrap"
)
```